COMP124 - Week 3 Stacks

Nested Calls

One way to handle subroutines is to store the return address in a special CPU register and, on return, copy that register back into the instruction pointer. This cannot handle nested calls: most reasonably complex programs use multiple subroutines that call each other, and as soon as a second call happens, the return address for the first call is overwritten.

To overcome this problem, return addresses are pushed onto a stack at the top of the running program's memory. This region is treated separately from the rest of memory and is accessed directly through CPU registers.
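
The single-register problem can be sketched as a small C simulation. This is only an illustration: link_reg, link_call and link_ret are hypothetical names, and the addresses used with them are made up.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t link_reg;      /* the single "return address" register */
static int      link_valid;    /* 1 while link_reg holds an unused return address */
static int      lost_returns;  /* how many return addresses were overwritten */

/* call: store the return address in the one register available */
void link_call(uint32_t return_address) {
    if (link_valid) {
        lost_returns++;        /* an outer call's return address is destroyed */
    }
    link_reg = return_address;
    link_valid = 1;
}

/* return: hand the stored address back (it would be copied into EIP) */
uint32_t link_ret(void) {
    assert(link_valid);        /* returning with no stored address is an error */
    link_valid = 0;
    return link_reg;
}
```

Calling link_call twice in a row (a nested call) leaves only the second return address available, so the outer subroutine can no longer get back to its caller.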

Machine Level Stacks

Stacks have two operations: Push, and Pop
The Intel x86 architecture uses main memory for the stack but accesses it via a register
ESP - stack pointer, always points to the memory address of the top item on the stack

Remember: the stack grows downwards in memory. The program code sits at the bottom of memory and goes up (from position 0 to 1 to 2, etc.), while the program's stack starts from the top of its allocated memory and works its way down

The push instruction:

Decrements ESP so it points to the next free area of memory on the stack
Writes the data item to that address
The pop instruction:
Moves the data addressed by ESP into the given register
Increments ESP by the correct amount to remove the item from the stack
Note that the data stays in memory until it is overwritten; the stack pointer simply moves past it
The programmer must take care to tidy the stack and ensure items are removed when no longer needed (pop whatever has been pushed)
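
The push and pop steps above can be sketched in C using an array as the stack region. This is a simulation under assumptions, not how the CPU is implemented: mem, MEM_WORDS and the 4-byte item size are choices made for the sketch, and esp is a byte offset into the array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 16                    /* hypothetical stack size, in 32-bit words */

static uint32_t mem[MEM_WORDS];         /* simulated memory reserved for the stack */
static uint32_t esp = MEM_WORDS * 4;    /* byte offset; just past the top means empty */

/* push: decrement ESP by 4, then write the item at the new top */
void push(uint32_t value) {
    assert(esp >= 4);                   /* no room left would be a stack overflow */
    esp -= 4;
    mem[esp / 4] = value;
}

/* pop: read the top item, then increment ESP by 4 */
uint32_t pop(void) {
    assert(esp < MEM_WORDS * 4);        /* popping an empty stack is an error */
    uint32_t value = mem[esp / 4];
    esp += 4;                           /* the old value is still in mem, just forgotten */
    return value;
}
```

After push(10), push(20), pop(), the value 20 is returned and ESP has moved back up, but the word that held 20 is unchanged until the next push overwrites it.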

Call and Return

The call instruction:

Pushes the return address onto the stack (the current value of EIP, which already points to the instruction after the call)
Puts the address of subroutine into EIP
The ret instruction:
Pops top item off the stack and places it into EIP

This allows for nested subroutines with correct, maintained return addresses
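
A minimal C sketch of call and ret with nesting, assuming a small array as the stack. sim_call and sim_ret are hypothetical helpers, and the caller passes the return address explicitly because the simulation has no real instruction stream.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t stack[8];   /* simulated stack, grows downwards */
static int      sp = 8;     /* index just past the top item (8 means empty) */
static uint32_t eip;        /* simulated instruction pointer */

/* call: push the return address, then jump to the subroutine */
void sim_call(uint32_t target, uint32_t return_address) {
    assert(sp > 0);         /* guard against overflowing the simulated stack */
    stack[--sp] = return_address;
    eip = target;
}

/* ret: pop the top item off the stack into EIP */
void sim_ret(void) {
    assert(sp < 8);         /* there must be a return address to pop */
    eip = stack[sp++];
}
```

If main calls A and A calls B, the two return addresses sit on the stack together, and each sim_ret restores the most recent one first, so B returns into A and A returns into main.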

Manipulating the Stack Pointer

ESP can be changed directly from the code
eg. to take 8 bytes off the stack: add esp, 8
Note that you can inspect any data on the stack as an offset to ESP
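
Both ideas (reading at an offset from ESP, and adding to ESP to discard items) can be sketched in C. The array contents and the helper names peek and discard are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* three items already pushed in the order 10, 20, 30 (stack grows downwards) */
static uint32_t mem[8] = {0, 0, 0, 0, 0, 30, 20, 10};
static uint32_t esp = 5 * 4;            /* byte offset of the top item (30) */

/* read the 32-bit item at [esp + offset] without popping anything */
uint32_t peek(uint32_t offset) {
    assert(offset % 4 == 0 && esp + offset < sizeof mem);
    return mem[(esp + offset) / 4];
}

/* equivalent of "add esp, n": take n bytes off the stack in one step */
void discard(uint32_t n) {
    assert(n % 4 == 0 && esp + n <= sizeof mem);
    esp += n;
}
```

Here peek(0) is the top item, peek(4) the one beneath it, and discard(8) removes two items at once without reading them.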

Subroutine Parameters

Pass By Value

A simple subroutine like this uses pass by value (values copied into registers)

; SUB bigger
bigger: cmp eax, ebx
		jl second
		ret
second: mov eax, ebx
		ret
; END bigger
...
mov eax, num1
mov ebx, num2
call bigger
mov max, eax

^ to execute max = maximum(num1, num2)
This depends on the caller and callee agreeing on which registers to use for the parameters and return value

Pass By Reference

A function that for example swaps two variables needs the memory locations, not just the values, so memory addresses are needed as parameters (pass by reference)

; SUB swap
swap:   mov ecx, [eax]
		mov edx, [ebx]
		mov [ebx], ecx
		mov [eax], edx
		ret
; END swap
...
lea eax, num1
lea ebx, num2
call swap

The caller and the callee still need to agree on which registers to use
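
For comparison, the two subroutines above could be written in C roughly as follows. This is a sketch of the same idea, not a translation the compiler is guaranteed to produce: int parameters are passed by value, pointer parameters give pass by reference.

```c
#include <assert.h>
#include <stddef.h>

/* Pass by value: a and b are copies, so the caller's variables are untouched */
int bigger(int a, int b) {
    return (a < b) ? b : a;     /* same signed comparison as cmp / jl */
}

/* Pass by reference: a and b are addresses, so the caller's variables change */
void swap(int *a, int *b) {
    assert(a != NULL && b != NULL);
    int tmp = *a;               /* read through the address, like mov ecx, [eax] */
    *a = *b;
    *b = tmp;
}
```

A caller would write max = bigger(num1, num2) but swap(&num1, &num2); the & operator plays the role of lea.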

(Note)
Intel x86 has an instruction that can swap values inside two registers
xchg eax, ebx
One operand can be a memory label but this is slow due to locking (a concurrency issue)

Stacking Parameters

If many parameters are needed, or registers are already in use for other data, you can stack the parameters:

Caller pushes parameters before making the call
Callee pops parameters and uses them
Stack must be tidied up
Both caller and callee need to agree on order of parameters and who tidies the stack

For example, for a rectangle area function:
Callee cleans the stack (stdcall)

; SUB area
area:   pop ebx ; the return address
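		pop edx ; height (pushed last, so popped first)
		pop eax ; width
		mul edx ; EDX:EAX = EAX * EDX, low 32 bits of the result in EAX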
		push ebx ; the return address
		ret
; END area
...
push width
push height
call area
mov result, eax
...

Caller cleans the stack (cdecl)

; SUB area
area:   mov eax, [esp+4]        ; height ([esp] holds the return address)
		mul dword ptr [esp+8]   ; EAX = height * width
		ret
; END area
...
push width
push height
call area
add esp, 8
mov result, eax
...

Calling Conventions

Caller and Callee must agree on a calling convention

Caller pushes parameters to stack in a given order
If callee tidies, it must pop parameters as it uses them
If caller tidies, callee must access parameters via ESP offsets
Return value location must be pre-agreed (EAX in the above examples)

Immediately after the call instruction is executed:

The return address will be the top thing on the stack
If callee is cleaning stack, it must pop/save the address then push it back at the end
If caller is cleaning stack, callee must include it in stack offset calculations

If the return address is forgotten about, the program will not work as intended
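
The caller-cleans case can be sketched as a C simulation to show why the callee's offsets start at 4. Everything here is an assumption for illustration: the array stands in for stack memory, esp is a byte offset, and 0x1234 is a made-up return address.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t mem[16];                /* simulated stack memory */
static uint32_t esp = sizeof mem;       /* byte offset; empty stack */

static void push(uint32_t v) {
    assert(esp >= 4);
    esp -= 4;
    mem[esp / 4] = v;
}

/* callee: [esp] holds the return address, so parameters start at [esp+4] */
static uint32_t area_callee(void) {
    uint32_t height = mem[(esp + 4) / 4];   /* last parameter pushed */
    uint32_t width  = mem[(esp + 8) / 4];   /* first parameter pushed */
    return width * height;                  /* result handed back "in EAX" */
}

/* caller: push parameters, call, then tidy the stack itself */
uint32_t area_cdecl(uint32_t width, uint32_t height) {
    push(width);
    push(height);
    push(0x1234);                       /* hypothetical return address pushed by call */
    uint32_t result = area_callee();
    esp += 4;                           /* ret pops the return address */
    esp += 8;                           /* add esp, 8: caller removes both parameters */
    return result;
}
```

If the callee read from [esp] and [esp+4] instead, it would multiply the return address by the height, which is the kind of failure described above.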

Intel x86 Calling Conventions

Four calling conventions are commonly used on Intel x86 (they are defined by compilers and operating systems rather than by the CPU itself)

cdecl - Push parameters on stack in reverse order (right to left); caller cleans stack
fastcall - First two parameters in ECX/EDX, rest reversed on stack; callee cleans stack
stdcall - Push parameters on stack in reverse order; callee cleans stack
thiscall - The this pointer (first parameter) in ECX, rest reversed on stack; callee cleans stack

fastcall and thiscall can be faster when there are few parameters, but they tie up registers that may be better used for something else

C library routines expect the programmer to use the cdecl convention.

I/O

I/O is hard in pure assembly, so instead:

Specific registers are used to point to the address of the data in memory
Trigger a CPU interrupt to pass control to the OS
Device drivers perform I/O by liaising with hardware

External subroutines (C library code) can be called in the same way as assembly subroutines, but we must follow cdecl

printf - Send formatted output to the console
scanf - Wait for input from the console

Program Output

To output things, printf is used:

It takes a format string as its first parameter
We pass the string by reference (we push its address, not its contents)
Following the cdecl convention:
Push the parameter to the stack
Use pass by reference
Clean up the stack afterwards

For example, to output a message:

#include <stdio.h>
#include <stdlib.h>

int main (void) {
	char msg[] = "Hello World\n";
	_asm {
		lea eax, msg
		push eax
		call printf
		pop eax
	}
	return 0;
}

Corrupted Registers

We don't know exactly what happens inside an external subroutine, but it will probably use some registers and overwrite whatever they held (under cdecl, the callee is allowed to change EAX, ECX and EDX). Any register values that matter to the calling code must be saved beforehand (use the stack)

Using the Stack

Save things onto the stack and then restore them after the external call returns
For example, for a program to output the string 10 times, the loop counter (ECX) must be maintained:

		mov ecx, 10
floop:  push ecx        ;counter onto the stack
		lea eax, msg
		push eax
		call printf
		pop eax
		pop ecx     ;bring the counter back
		loop floop

This ensures the code works as intended even if the external subroutine uses ECX: its value is saved before the call and restored after the subroutine returns, before the loop instruction uses it
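
The effect of saving the counter can be sketched as a C simulation. The names ecx, external_routine and run_loop are hypothetical, the external routine deliberately trashes the counter, and the loop is capped at 100 iterations so the unsaved case terminates.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t ecx;            /* simulated ECX register */
static uint32_t saved[4];       /* tiny simulated stack */
static int      top = 0;

/* stand-in for an external routine such as printf: it is free to change ECX */
static void external_routine(void) {
    ecx = 0xDEADBEEF;
}

/* Runs the loop body and returns how many times it executed */
int run_loop(int save_counter) {
    int iterations = 0;
    ecx = 10;                                   /* mov ecx, 10 */
    do {
        if (save_counter) {
            assert(top < 4);
            saved[top++] = ecx;                 /* push ecx */
        }
        external_routine();                     /* call printf */
        if (save_counter) {
            ecx = saved[--top];                 /* pop ecx */
        }
        iterations++;
        ecx--;                                  /* loop: decrement ECX ... */
    } while (ecx != 0 && iterations < 100);     /* ... repeat while non-zero (capped here) */
    return iterations;
}
```

With save_counter set the body runs exactly 10 times; without it the counter is overwritten on every pass and only the cap stops the loop.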

Outputting Values

The printf subroutine can take extra parameters holding values to be output
- Each parameter inserted into the string in place of a format specifier
- Inserted in the order that they appear in the parameter list
Eg. to output someone's name and age:
- Param 1: "I am %s and I am %d years old\n"
- Param 2: "Bob"
- Param 3: 21
Assuming parameters are passed in the correct order, this would output:
- "I am Bob and I am 21 years old"

Format Specifiers

%d - Display as a decimal integer
%s - Display as a string
%c - Display as a single character
%f - Display as a floating-point number

Parameters must match the specifiers in the string

Types must match
Number of parameters must match
Parameters in correct order
If this is not done correctly, the behaviour is undefined: the program may print garbage or crash

For example:

char msg[] = "The number is %d\n";
int num = 7;
_asm {
	push num      // Parameters pushed in reverse order (cdecl)
	lea eax, msg
	push eax
	call printf
	add esp, 8
}

Adding to ESP is a quick way to clean up multiple parameters at once
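
The name-and-age example from earlier in this section can be sketched in plain C. It uses snprintf rather than printf so that the formatted text ends up in a buffer that can be inspected; the helper name describe is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Formats the example message into out; returns the number of characters written */
int describe(char *out, size_t size, const char *name, int age) {
    assert(out != NULL && size > 0);
    /* %s consumes name and %d consumes age: order and types must match */
    return snprintf(out, size, "I am %s and I am %d years old\n", name, age);
}
```

The C compiler pushes these arguments for us; in the assembly version the same values would be pushed by hand in reverse order (age, then the address of the name, then the address of the format string).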

Program Input

To input data, scanf is used
In its simplest form it takes two parameters:

Param 1: A format string containing a format specifier to indicate the type of data
Param 2: The memory address where the data should be stored

Following the cdecl convention:

Push the parameters to the stack in reverse order
Use pass by reference
Clean up the stack after

Strings can be taken as input if care is taken to reserve enough memory

Use the %s format specifier
Declare a char array that is big enough to store what the user enters (easy to overflow)

For example:

char fmt[] = "%d";
int num;
_asm {
	lea eax, num // Remember the address is needed, not the value
	push eax     // Params pushed in reverse order
	lea eax, fmt
	push eax
	call scanf
	add esp, 8
}
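
The overflow risk with %s can be reduced by putting a maximum width in the specifier. A minimal C sketch, assuming a 16-byte name buffer; it uses sscanf (which reads from a string instead of the console) so the behaviour is easy to check, and parse_person and NAME_MAX_LEN are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

#define NAME_MAX_LEN 15   /* hypothetical limit chosen for this sketch */

/* Parses "<name> <age>" from text; returns the number of fields converted */
int parse_person(const char *text, char name[NAME_MAX_LEN + 1], int *age) {
    assert(text != NULL && name != NULL && age != NULL);
    /* %15s stores at most 15 characters plus the terminator, so the
       16-byte buffer cannot overflow; name and age are both addresses */
    return sscanf(text, "%15s %d", name, age);
}
```

The return value says how many items were actually read, which is worth checking before using the results.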

Stacking Local Variables

In high level languages, subroutines can have local (internal) variables that only exist while the subroutine is active. This can be done in assembly using the stack

Stack Frames

Each time a subroutine is called, a new stack frame is created on the stack
This holds data that is needed by the subroutine:

Parameters
Return address
Local variables
With nested calls, several stack frames will be present on the stack
Along with ESP, the CPU has another register EBP (stack base pointer)
This always points to the start of the current stack frame
Can be used to access parameters and local variables using an offset
eg. EBP+8 is the address of the last parameter that was pushed, and EBP-4 is the address of the first local variable

Building the Stack Frame

ESP always points to the top of the stack
EBP initially points to the base of the stack
When a subroutine is called:

Parameters are pushed to the stack first
Then the return address is pushed
Then the old value of EBP is pushed
Current value of ESP is put into EBP (to begin a new stack frame)
Local variables are reserved on the stack (ESP decreases, EBP stays fixed)
When a subroutine is ready to return:
Remove any local variables from the stack
Pop top value into EBP (restore previous stack frame)
Pop top value into EIP (move execution back to caller)
Caller is responsible for cleaning any parameters still on the stack
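
Building and tearing down a frame can be sketched as a C simulation. It assumes the usual x86 order (push ebp, then mov ebp, esp, then sub esp, n) and a cdecl caller; the array, the byte-offset registers and the return address 0x1234 are all made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t mem[32];                /* simulated stack memory */
static uint32_t esp = sizeof mem;       /* byte offsets into mem; empty stack */
static uint32_t ebp = sizeof mem;

static void push(uint32_t v) { assert(esp >= 4); esp -= 4; mem[esp / 4] = v; }
static uint32_t pop(void) { assert(esp < sizeof mem); uint32_t v = mem[esp / 4]; esp += 4; return v; }
static uint32_t *at(uint32_t addr) { assert(addr < sizeof mem); return &mem[addr / 4]; }

/* callee with one local variable: computes a - b */
static uint32_t diff_callee(void) {
    push(ebp);                          /* push ebp     : save the caller's frame */
    ebp = esp;                          /* mov ebp, esp : begin a new frame */
    esp -= 4;                           /* sub esp, 4   : reserve one local */
    /* [ebp+12] is a (pushed first), [ebp+8] is b (pushed last), [ebp-4] is the local */
    *at(ebp - 4) = *at(ebp + 12) - *at(ebp + 8);
    uint32_t eax = *at(ebp - 4);        /* return value */
    esp = ebp;                          /* mov esp, ebp : remove the locals */
    ebp = pop();                        /* pop ebp      : restore the previous frame */
    return eax;
}

uint32_t diff_cdecl(uint32_t a, uint32_t b) {
    push(a);
    push(b);
    push(0x1234);                       /* hypothetical return address pushed by call */
    uint32_t result = diff_callee();
    (void)pop();                        /* ret pops the return address */
    esp += 8;                           /* caller cleans the two parameters */
    return result;
}
```

Because EBP stays fixed while the subroutine runs, the offsets to parameters and locals do not change even if the subroutine pushes more data and ESP moves.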

Nested Calls and Stack Frames

If a subroutine calls another nested subroutine
The stack grows as a stack frame is built up
- Parameters
- Return address
- Old base pointer
- Local variables
Values of EBP and ESP change as the calls happen
- ESP always points to the top of the stack
- EBP changes with each subroutine call
Stack is cleaned up (gets smaller) as each subroutine returns